Precision of Hubble constant derived using black hole binary absolute distances and statistical redshift information

Measured gravitational waveforms from black hole binary inspiral events directly determine absolute
luminosity distances. To use these data for cosmology, it is necessary to independently obtain
redshifts for the events, which may be difficult for those without electromagnetic counterparts. Here
it is demonstrated that certainly in principle, and possibly in practice, clustering of galaxies allows
extraction of the redshift information from a sample statistically for the purpose of estimating mean
cosmological parameters, without identification of host galaxies for individual events. We extract
mock galaxy samples from the 6th Data Release of the Sloan Digital Sky Survey resembling those
that would be associated with inspiral events of stellar mass black holes falling into massive black
holes at redshift $z \sim 0.1$ to 0.5. A simple statistical procedure is described to estimate a likelihood
function for the Hubble constant $H_0$: each galaxy in a LISA error volume contributes linearly to
the log likelihood for the source redshift, and the log likelihood for each source contributes linearly
to that of $H_0$. This procedure is shown to provide an accurate and unbiased estimator of $H_0$. It is
estimated that a precision better than one percent in $H_0$ may be possible if the rate of such events
is sufficiently high, on the order of 20 to $z = 0.5$.

PACS numbers: 98.80.Es 



I. PRECISION COSMOLOGY FROM BLACK
HOLE BINARIES



A low frequency gravitational wave detector, such as 
LISA, is capable of measuring very high signal to noise 
waveforms from inspirals and mergers of cosmologically 
distant black hole binaries. From the measured wave- 
form alone it is possible to estimate the parameters of 
a binary with high precision, including its direction on 
the sky and its absolute luminosity distance [1]. Aside
from numerical factors, the absolute radius of the final 
hole is fixed by the square of the orbital period divided 
by the orbit decay or chirp time; the distance is this ab- 
solute length divided by the measured wave amplitude. 
The gravitational calibration of distance is not accompa- 
nied by the usual systematics associated with astronom- 
ical modeling; indeed it does not even require Standard 
Model physics. Since a single black hole binary merger 
can provide distances with absolute precision much better
than one percent, this capability may offer a potentially
transformative tool for precision measurement of
cosmological parameters [1, 3]. Precision measurement
of an absolute distance scale, as embodied in the
Hubble constant, is important for breaking degeneracies 
in estimates of cosmological parameters (such as those 
characterizing cosmological curvature and Dark Energy) 
using other datasets, such as cosmic background radiation
anisotropy.

On the other hand there are obstacles in practice to
applying the technique. One problem is distance errors
added by gravitational lensing along the line of sight.
The best raw distances are given by massive black
hole (MBH) binary inspiral events (where both holes are
larger than $10^{[\ldots]}\,M_\odot$, say), many of which will be
measured with very high signal-to-noise ratio.
They are predicted to be fairly frequent (one or two
events per week on average) and should be observable
with LISA out to redshifts greater than 10. Unfortunately,
most of the observable events are predicted
to occur at redshifts greater than 2, where errors due to
lensing are substantial. Thus one is led to consider another
class of events, stellar mass black holes inspiralling
into massive black holes (the so-called extreme mass ratio
inspiral or EMRI events), which
occur frequently in galaxies at $z < 1$ where lensing errors
are subdominant. Although the intrinsic precision
of these distances is lower due to lower signal-to-noise
ratio, they still may provide a unique and precise
cosmological probe.

The other major uncertainty is associated with mea- 
suring the redshifts of the events. The gravitational 
waveform provides a measurement of luminosity dis- 
tance, and a cosmological probe such as a mean redshift- 
distance relation requires an independent measurement 
of redshift. In the case of MBH events, it may be possible
to identify the host galaxy by seeking electromagnetic
counterparts to the merger,
such as optical or x-ray variability in accretion disks from
the rapid evolution in the gravitational potential of the
binary black hole, or even from the sudden removal of
several percent of $Mc^2$ in gravitational radiation at the
moment of merger. But in the case of EMRI events, no
compelling model requiring an electromagnetic counter- 
part has been offered: it may be possible to merge a small 
black hole with a large one with practically no signature 
aside from the gravitational radiation itself.

The point of this paper is to sketch and demonstrate
a technique for measuring cosmological parameters such
as the Hubble constant $H_0$ by obtaining only statistical
information about the redshifts of the EMRI hosts,
even without identifying the host galaxy of any individual
event. For each event, the estimate of direction $\theta$
and distance $D_L$, together with a cosmological model
relating $D_L$ and redshift $z$ roughly estimated from other
techniques, provide a three-dimensional "error box" in
$\theta$, $z$ space. As first noted in [1], a galaxy redshift survey
in this error box then provides a statistical estimate of
the host redshift, since galaxies are highly clustered with
each other. Here we demonstrate, using a real survey,
a technique for estimating the likelihood distribution of
the host galaxy redshift, which is more accurate (statistically)
than the prior information on $H_0$ that went into
constructing the error box.

In order to demonstrate the utility of this technique 
in practice, we construct mock LISA error boxes in the 
Sloan Digital Sky Survey (SDSS) volume
and generate mock redshift surveys from samples
of SDSS galaxies that have about the same statistical
clustering properties as a host galaxy population of LISA 
EMRI sources. One reason to choose a real galaxy sur- 
vey rather than a simulation for these realizations is that 
the higher order correlations between galaxies in the cos- 
mic web, which are important for determining the likeli- 
hood distribution of redshifts, are known to be correct, 
as long as the SDSS catalog selection approximates an 
unbiased sample of a typical EMRI host galaxy popu- 
lation. We find that with enough events, precision bet- 
ter than one percent is possible in measuring the mean 
redshift-distance relation. This translates into comparable
precision in $H_0$, by a technique that shares very
few systematic errors or biases with other means of measuring
$H_0$. It also enables other new cosmological tests,
for example, direct measurement of cosmic acceleration 
within the redshift range (z < 0.5) where the universe is 
dominated by Dark Energy. 



II. USE OF STATISTICAL REDSHIFTS FOR 
COSMOLOGICAL PARAMETER ESTIMATION 
FROM GRAVITATIONAL WAVEFORMS 

A fit to a gravitational wave signal from a black 
hole inspiral event j leads to a likelihood distribution 
$\ln \mathcal{L}_j(\theta, D_L)$ for its angular location $\theta$ and luminosity
distance $D_L$. We wish to combine this with information
from the directions and redshifts of galaxies to measure
mean cosmological parameters such as the Hubble constant
$H_0$.

For the simple realizations shown in Section III, the
selection function is highly idealized: all galaxies are
assumed to be equally likely hosts within a certain error
box. The angular size of the box is defined by LISA
errors in angle, and the depth of the box is determined
primarily by prior errors on $H_0$ from other sources. For
this discussion, intended only to estimate typical errors,
we also assume linear Hubble flow, in particular negligible
cosmic acceleration (though this is taken into account
where needed in the scaling exercise below; the linear
discussion is valid as long as acceleration is negligible within
each redshift error box). With this simple selection and
expansion model, the log-likelihood distribution for the
Hubble constant for each event $j$ is (up to a constant
numerical factor)

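$$\ln \mathcal{L}_j(H_0) = N_j^{-1} \sum_i \ln \mathcal{L}_j(D_L = c z_i / H_0), \qquad (1)$$

summed over the galaxies $i$ in the box, and for a whole
sample,

$$\ln \mathcal{L}(H_0) = \sum_j \sum_i N_j^{-1} \ln \mathcal{L}_j(D_L = c z_i / H_0), \qquad (2)$$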

where the normalization by $N_j$, the number of galaxies
actually measured in each box, is included so that each
event is weighted equally, independent of the number of
galaxies measured. Thus each galaxy gets to vote appropriately
on its source redshift, while each EMRI source
contributes on an equal footing to estimating a value for
$H_0$.
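
A minimal Python sketch of the voting scheme in Eqs. (1) and (2) follows. It rests on assumptions that are not in the text at this point: each galaxy's vote is binned by the Hubble constant it would imply, $H_0 = c z_i / D_L$; the 65 to 75 km s$^{-1}$ Mpc$^{-1}$ window and 0.1 bin width are borrowed from the realizations of Section III; and the function names and inputs are hypothetical.

```python
C_KM_S = 299792.458  # speed of light in km/s


def event_histogram(d_l_mpc, galaxy_z, h_min=65.0, h_max=75.0, bin_width=0.1):
    """Normalized histogram of candidate H0 values for one event (Eq. 1).

    Each galaxy i votes for H0 = c * z_i / D_L. Votes are weighted by 1/N_j,
    where N_j counts the galaxies that fall inside the H0 window.
    Returns None when no galaxy falls inside the window.
    """
    n_bins = int(round((h_max - h_min) / bin_width))
    counts = [0] * n_bins
    for z in galaxy_z:
        h0 = C_KM_S * z / d_l_mpc
        if h_min <= h0 < h_max:
            idx = min(int((h0 - h_min) / bin_width), n_bins - 1)
            counts[idx] += 1
    n_j = sum(counts)
    if n_j == 0:
        return None
    return [c / n_j for c in counts]


def stacked_histogram(events, **kwargs):
    """Sum the normalized per-event histograms (Eq. 2).

    `events` is a list of (D_L in Mpc, list of galaxy redshifts) pairs.
    Returns the stacked histogram and the number of events that contributed.
    """
    total = None
    n_used = 0
    for d_l_mpc, galaxy_z in events:
        hist = event_histogram(d_l_mpc, galaxy_z, **kwargs)
        if hist is None:
            continue  # empty boxes carry no redshift information
        total = hist if total is None else [a + b for a, b in zip(total, hist)]
        n_used += 1
    return total, n_used
```

Because every per-event histogram sums to one, a box with three galaxies and a box with three hundred carry the same total weight in the stack, which is the equal-footing property described above.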

If the redshift distribution were very smooth on the 
scale of the LISA error boxes, this technique would not 
return useful information on redshift: the likelihood distribution
of $H_0$ would be the same as that going into
construction of the error boxes. However, actual galaxies
are highly clustered, so if the angular errors are not too
large, there is recoverable statistical information on $z$, as
demonstrated quantitatively below.

There is a possibility of some bias in the estimate of 
$H_0$, for example if the rate of EMRIs is rapidly evolving
with redshift, which means the weighting within each box 
should not be uniform in z. Since the mean evolution 
will actually be measured, such biases can be measured 
and accounted for statistically; however, we do not model 
them here. 



III. MOCK LISA/EMRI HOST-GALAXY 
SAMPLES FROM SDSS 

If galaxies are sufficiently clustered, then within the 
region of $\theta_i$, $D_i$ allowed for each event, a redshift catalog
can give an estimate of the host redshift without knowing 
which galaxy is the host. The question is whether for 
realistic galaxy clustering and LISA error boxes, there 
is enough redshift information to be useful. We answer 
this question by generating mock redshift samples from 
regions of the SDSS volume scaled assuming standard 
cosmology to have about the same clustering properties 
as EMRI hosts. 

We use estimates of LISA errors in distance and 
sky position from Cutler (private communication). For
$(10 + 10^{[\ldots]})\,M_\odot$ EMRIs at a redshift $z$, and a fully functioning
LISA with two effective synthetic interferometers, we
use an error box with a solid angle $\Delta\Omega \approx 4z^2$ square
degrees, and assume distance error $\Delta\ln(D_L) \approx 0.05z$. For
a LISA with only one synthetic interferometer, we adopt
$\Delta\Omega \approx 16z^2$ square degrees, and $\Delta\ln(D_L) \approx 0.07z$. It
should be noted that these estimates only define a 63% 
confidence interval; therefore, in a proper simulation one 
should increase the size of the error box by some amount 
to improve the confidence limits. However, in our ex- 
ploratory analysis we neglect this correction. 
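
As a rough illustration of how the error box grows with redshift, the single-interferometer scalings quoted above can be evaluated directly. This is only a sketch: the function name is hypothetical, and treating the box as a square on the sky (so that its side is the square root of the solid angle) is an assumption made here for convenience.

```python
import math


def one_interferometer_box(z):
    """Error box for a single synthetic interferometer at redshift z.

    Uses the scalings quoted in the text: solid angle of about 16 z^2
    square degrees and fractional distance error of about 0.07 z.
    The square-box side length is an assumption of this sketch.
    """
    solid_angle = 16.0 * z ** 2      # square degrees
    side = math.sqrt(solid_angle)    # degrees, assuming a square box
    dln_dl = 0.07 * z                # fractional error in D_L
    return solid_angle, side, dln_dl


solid_angle, side, dln_dl = one_interferometer_box(0.2)
print(f"{solid_angle:.2f} sq deg, {side:.1f} deg per side, {100 * dln_dl:.1f}% in D_L")
```

At $z = 0.2$ this gives a box of 0.64 square degrees, 0.8 degrees on a side, with a 1.4% distance error.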

To generate a mock EMRI catalog, a galaxy is selected
at random in the SDSS catalog and an error box is
generated centered at this redshift and direction. Examples
of simulated error boxes for the single synthetic
interferometer case are shown in Fig. 1. While the ranges in
right ascension and declination are calculated using
Cutler's estimates above, the range in redshift corresponds to
an error in $H_0$ of about 7% (representing a cosmological
prior obtained from other techniques). A histogram is
then generated from the SDSS galaxies in this box, with
the originally selected host galaxy removed. (Thus we
assume that the actual host may not even be a visible
galaxy, only that it correlates with other galaxies in the
usual way.) This gives the likelihood distribution for the
EMRI host redshift and, according to (1), gives $\ln \mathcal{L}_j(H_0)$
for this event. Assuming a value $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$,
a linear Hubble's Law, and the same distance for all
galaxies in the box (calculated from the host redshift), we
convert redshifts into Hubble units. Thus our histograms
display a likelihood distribution for $H_0$ estimated for a
sample, which should be compared with a "true" value
of 70 km s$^{-1}$ Mpc$^{-1}$.
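
The redshift window and the conversion to Hubble units described above can be sketched as follows. This is a minimal illustration under the paragraph's own assumptions (a linear Hubble law, and every galaxy placed at the host's distance). The function names are hypothetical, and the 65 to 75 km s$^{-1}$ Mpc$^{-1}$ window is taken from the histogram edges quoted in the text, which correspond to roughly a 7% spread around 70.

```python
H0_TRUE = 70.0               # km/s/Mpc, the "true" value assumed in the text
H0_MIN, H0_MAX = 65.0, 75.0  # histogram edges, about +/- 7% around H0_TRUE


def redshift_window(z_host):
    """Redshift range of an error box centered on the host redshift."""
    return z_host * H0_MIN / H0_TRUE, z_host * H0_MAX / H0_TRUE


def to_hubble_units(z_galaxy, z_host):
    """H0 implied by a galaxy if it sat at the host's distance.

    With D = c * z_host / H0_TRUE and a linear Hubble law,
    H0 = c * z_galaxy / D = H0_TRUE * z_galaxy / z_host.
    """
    return H0_TRUE * z_galaxy / z_host


z_lo, z_hi = redshift_window(0.14)
print(round(z_lo, 3), round(z_hi, 3))          # 0.13 0.15
print(round(to_hubble_units(0.141, 0.14), 2))  # 70.5
```

A galaxy at exactly the host redshift maps to 70 km s$^{-1}$ Mpc$^{-1}$, so clustering around the host shows up as a peak near the assumed true value.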

To simulate the LISA distance error for each source,
each individual histogram is displaced by a random
amount, selected from an interval corresponding to an
error of $\Delta H_0/H_0 = 0.05z$ for two synthetic
interferometers, and $\Delta H_0/H_0 = 0.07z$ for one synthetic
interferometer. As an example for the one interferometer
case, given a redshift of 0.2 the error would be
$\Delta H_0 = 0.98$ km s$^{-1}$ Mpc$^{-1}$, and the histogram (that is,
each "box") is displaced by a random amount between
$-0.49$ and $+0.49$ in $H_0$. This error is applied before
establishing the edges of the histogram at $H_0 = 65$ and
$H_0 = 75$ km s$^{-1}$ Mpc$^{-1}$, so that new galaxies are
allowed to enter the histogram due to the shift in $H_0$.
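
The displacement step can be written out as a short sketch. The function names are hypothetical, and drawing the shift from a uniform distribution over the interval is an assumption: the text says only that the amount is random and selected from the interval.

```python
import random

H0_TRUE = 70.0  # km/s/Mpc, the "true" value assumed in the text


def displacement_half_width(z, n_interferometers=1):
    """Half-width of the H0 displacement interval, in km/s/Mpc.

    The full interval width is Delta H0 = (0.07 z or 0.05 z) * H0_TRUE,
    for one or two synthetic interferometers respectively.
    """
    frac = {1: 0.07, 2: 0.05}[n_interferometers] * z
    return 0.5 * frac * H0_TRUE


def displaced_votes(h0_votes, z, rng, n_interferometers=1,
                    h_min=65.0, h_max=75.0):
    """Shift all votes of one box by a common random offset, then apply
    the histogram edges, so galaxies can enter or leave the window.
    The uniform draw is an assumption of this sketch."""
    half = displacement_half_width(z, n_interferometers)
    shift = rng.uniform(-half, half)
    return [h + shift for h in h0_votes if h_min <= h + shift < h_max]


print(round(displacement_half_width(0.2), 2))  # 0.49
```

Applying the shift before the window cut reproduces the ordering described above, in which a vote just outside 65 or 75 can be carried into the histogram by the shift.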

A set of such samples can be combined by stacking 
their histograms (appropriately normalized to give all 
events equal weight independent of the number of galaxies
in their sample, as in (1)). Since each box is generated
with the originally chosen source galaxy assuming
$H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, a peak should emerge around
this value in the stacked histogram. The width of the
summed distribution measures the width of the distribution
of galaxies, and the offset of the fitted mean from
the center (70 km s$^{-1}$ Mpc$^{-1}$) measures the actual offset
of the "measured" from the "true" value of $H_0$ for each
set of realizations.
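
One simple way to read a center and a width off a stacked histogram is through its weighted moments, sketched below. Note that this is a substitute technique chosen for brevity, not the Gaussian least-squares fit described in the caption of Fig. 2. The function name and the default bin layout (starting at 65 with width 0.1) are assumptions.

```python
import math


def histogram_moments(weights, h_min=65.0, bin_width=0.1):
    """Weighted mean and standard deviation of a binned H0 distribution.

    `weights` holds the stacked histogram values, one per bin, with bin k
    centered at h_min + (k + 0.5) * bin_width. This is a moment estimate,
    used here in place of a Gaussian fit.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("histogram is empty")
    centers = [h_min + (k + 0.5) * bin_width for k in range(len(weights))]
    mean = sum(w * c for w, c in zip(weights, centers)) / total
    var = sum(w * (c - mean) ** 2 for w, c in zip(weights, centers)) / total
    return mean, math.sqrt(var)
```

For a strongly non-Gaussian stack the moment width is sensitive to the tails, so it should be expected to differ from the width of a fitted Gaussian.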



Results of several mock realizations are displayed in 
Fig. 2. For each of these sums, 20 "EMRIs" were randomly
selected in the SDSS (DR6) spectroscopic survey
volume (673,264 galaxies total, classified spectroscopi- 
cally), excluding the distant part of the survey where 
galaxy selection is dominated by different criteria. The 
(spectroscopic) redshifts were limited to 0.02 < z < 0.23, 
and all mock error boxes were limited to the regions of 
the northern galactic cap covered by DR6 (shown on 
the SDSS website, which is listed under 'Acknowledge- 
ments'). The SDSS provides a statistically complete 
sample for galaxies with r-band Petrosian magnitudes
$r < 17.77$ [32]. In our analysis we neglect peculiar velocities
of galaxies, which are likely to add errors of less
than about 1% per object even for the nearest plausible 
samples and are therefore subdominant. 

Each individual error box that was constructed yields 
$\ln \mathcal{L}_j(H_0)$, displayed as a histogram which plots the number
of galaxies versus $H_0$. The originally chosen source
galaxy was subtracted from each histogram, and therefore
error boxes that contained only the chosen source
galaxy were ignored. This is reflected in the final box
count (the initial box count is 20), which is listed above
each plot. Each of these histograms was then normalized
by dividing by the total number of galaxies contained
in each error box ($N_j$), not including the original source
galaxy. Individual histograms were then added together,
resulting in a plot of $\ln \mathcal{L}(H_0)$ that contains information
purely from galaxy clustering. 

The results from these realizations are summarized in 
Table I, where the average redshift of the chosen source
galaxies is included, as well as the total number of galaxies
used in the summed histogram. It can be seen that the
samples cluster near the "true" value of 70 km s$^{-1}$ Mpc$^{-1}$
and would allow a measurement of $H_0$ with a typical error
in the mean of about 0.2 km s$^{-1}$ Mpc$^{-1}$ (averaged
over 25 realizations for the one interferometer case), or 
a precision of about one third of a percent. This shows 
that the statistical redshift technique is credible with a 
realistic galaxy distribution and LISA errors. 

As expected, the distribution of candidate Hubble constants
is highly non-Gaussian, so a sample at least this
large is needed for a reliable result. In order to get a
sample of order 10 or more EMRIs at such a low redshift,
the overall rate of EMRIs would have to be at the
high end of current estimates [22].

Therefore, in a second round of experiments, we tested 
the reliability of the above technique for larger redshifts, 
still using the same SDSS data (with z limited to 0.02 < 
z < 0.23) but scaling to estimate higher redshift behavior 
in the same galaxy population. We chose to scale to 
higher z from this SDSS range since the SDSS does not 
sample typical galaxy populations at $z \gtrsim 0.23$.

FIG. 1: Simulated LISA error boxes for one synthetic interferometer. In each box the chosen source galaxy lies at the center,
indicated by an asterisk, and dots represent all other galaxies in the box. Boxes are scaled to represent events at
$z_{\mathrm{LISA}} = A z_{\mathrm{SDSS}}$, where $A$ is specified above each box, and the pretend LISA angular coordinates, given by
$\theta_{\mathrm{LISA}} = (1/A)\,\theta_{\mathrm{SDSS}}$, are plotted.

FIG. 2: Summed histograms (bin size = 0.1) with Gaussian fits overlaid. Top row: one synthetic interferometer case; bottom
row: two synthetic interferometers. The true value of $H_0$ was assumed to be 70 km s$^{-1}$ Mpc$^{-1}$. Gaussian fits were generated
using a Levenberg-Marquardt technique to iteratively search for a best least-squares fit to $\ln \mathcal{L}(H_0)$. The starting value for the
mean was a random value between 68 and 72, and those for the width and area were both 1. Vertical axes are in arbitrary
units.

TABLE I: Results for realizations in Fig. 2, starting from top left.

Number of Boxes   Average z   $\sum N_j$   Width of Gaussian          Mean $H_0$
(Sources)                                  [km s$^{-1}$ Mpc$^{-1}$]   [km s$^{-1}$ Mpc$^{-1}$]
15                0.125       72           0.28                       69.83
16                0.109       74           0.54                       70.11
15                0.135       103          0.66                       70.27
10                0.119       46           0.59                       70.00
13                0.113       32           0.80                       69.64
10                0.114       21           0.50                       69.58

Adopting a larger redshift value for a fiducial EMRI
event increases the LISA angular and distance errors
according to Cutler's estimates, and at the same time
requires a scaling of the SDSS data in order to estimate
the properties of a similarly-clustered galaxy host
population at larger redshift. For one synthetic interferometer,
Cutler's estimates give us an angular error
$\Delta\theta_{\mathrm{LISA}} = 4 z_{\mathrm{LISA}}$ degrees on each side of an error
box. When scaling by a factor $A$ to a source redshift
$z_{\mathrm{LISA}} = A z_{\mathrm{SDSS}}$, where $z_{\mathrm{SDSS}}$ is the redshift of a random
host galaxy in SDSS, the LISA angular error is

$$\Delta\theta_{\mathrm{LISA}} = 4 A z_{\mathrm{SDSS}} \ \mathrm{degrees}. \qquad (3)$$

To estimate the angle corresponding to the galaxy struc- 
ture at higher redshift in this angular box, we assume 
a $\Lambda$CDM cosmology with a few simplifications. Assume
that the galaxy distribution is frozen in comoving coordinates,
which is approximately true in a $\Lambda$-dominated
universe. Then a structure that has an angular size $\theta_{\mathrm{LISA}}$
at redshift $z_{\mathrm{LISA}}$ will appear at $z_{\mathrm{SDSS}}$ with an angular
size

$$\theta_{\mathrm{SDSS}} = \theta_{\mathrm{LISA}}\,(D_{\mathrm{LISA}}/D_{\mathrm{SDSS}})\,[(1+z_{\mathrm{LISA}})/(1+z_{\mathrm{SDSS}})], \qquad (4)$$

where angular size distances are denoted by $D$. One can
verify numerically in $\Lambda$CDM that within 10%, this agrees
with a simple formula,

$$\theta_{\mathrm{SDSS}} = \theta_{\mathrm{LISA}}\,(z_{\mathrm{LISA}}/z_{\mathrm{SDSS}}), \qquad (5)$$

which can be obtained by using a small $z$ expansion and
neglecting higher order terms in $z$ (which is appropriate
since the range in $z$ in a LISA error box is small). Therefore,
combining (3) and (5), the angular size we use on
one side of an SDSS error box in our scaled realizations
scales as

$$\Delta\theta_{\mathrm{SDSS}} = \Delta\theta_{\mathrm{LISA}}\,A = 4A^2 z_{\mathrm{SDSS}} \ \mathrm{degrees}. \qquad (6)$$
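
The comparison between the exact angular scaling, Eq. (4), and the simple formula, Eq. (5), can be checked numerically with a short sketch. The density parameters below (matter 0.3, cosmological constant 0.7, spatially flat) are assumed values chosen for illustration and do not come from the text, and the function names are hypothetical. In a flat universe the angular size distance times $(1+z)$ is the comoving distance, so the ratio in Eq. (4) reduces to a ratio of comoving distances.

```python
import math

OMEGA_M = 0.3  # assumed matter density parameter (not from the text)
OMEGA_L = 0.7  # assumed cosmological constant density (flat universe)


def comoving_distance(z, steps=1000):
    """Comoving distance in units of c/H0 for flat LCDM (Simpson's rule)."""
    if steps % 2:
        steps += 1
    h = z / steps

    def inv_e(x):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + x) ** 3 + OMEGA_L)

    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * inv_e(k * h)
    return total * h / 3.0


def angle_ratio_exact(z_sdss, a):
    """theta_SDSS / theta_LISA from Eq. (4), with z_LISA = a * z_SDSS."""
    return comoving_distance(a * z_sdss) / comoving_distance(z_sdss)


def sdss_box_side_deg(z_sdss, a):
    """Side of the scaled SDSS error box from Eq. (6), in degrees."""
    return 4.0 * a ** 2 * z_sdss


for a in (2, 3, 4, 5):
    exact = angle_ratio_exact(0.1, a)
    print(a, round(exact / a, 3), round(sdss_box_side_deg(0.1, a), 2))
```

The second printed column is the exact ratio divided by the simple value $A$ of Eq. (5). Under these assumed parameters it falls below one and moves further from one as $A$ increases, since the neglected higher order terms in $z$ grow with the source redshift.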

The LISA distance error was taken into account using
the same method as before, except now scaled to
$\Delta\ln(D_L) \approx 0.07 A z_{\mathrm{SDSS}}$ for one synthetic interferometer.
The results of these realizations are shown in Fig. 3,
where scalings of $A = 2$, 3, 4, and 5 were used.

Results are summarized in Table II. It can be seen that
these are much more realistic than the lower-redshift
realizations, with typical boxes containing hundreds of EMRI
host tracer galaxies. Nevertheless, the final $H_0$ estimate
remains precise even for redshifts out to $z_{\mathrm{LISA}} = 0.5$. It
should be noted that since these error boxes are much
larger and are still limited to the northern galactic cap,
significant overlapping of error boxes occurs when scaling
to high redshift, primarily for $A \gtrsim 5$. Therefore, the
results for these cases are not as reliable due to lack of
independence.

It should also be noted that even for these larger error 
boxes, there are still cases where few galaxies (< 10) 
are contained within a box, and occasionally there will 
only be one to two galaxies. These galaxies contribute 
significantly to the final, summed histograms. Figure 4
shows histograms of the number of galaxies contained in
each box, that is, the distributions of $N_j$ values, for each
of the plots in Fig. 3.

The resulting errors in the mean ($\delta H_0 = |70 - \mathrm{mean}|$),
averaged over 25 realizations for each value of $A$, are listed
in Table III, all for the single synthetic interferometer
case.

In conclusion, we have found that if LISA detects 20
or more EMRI events to a redshift of $z \approx 0.5$, galaxy
surveys of the LISA error boxes are likely to yield a reliable
estimate of the Hubble constant to better than one
percent precision. A higher rate of EMRI events would
permit estimates of cosmic acceleration from the
redshift-distance relation in this redshift range with considerably
more precision than other known techniques.

FIG. 3: Summed histograms for one synthetic interferometer, scaled to represent events at $z_{\mathrm{LISA}} = 2z_{\mathrm{SDSS}}$ (top row),
$z_{\mathrm{LISA}} = 3z_{\mathrm{SDSS}}$ (second row), $z_{\mathrm{LISA}} = 4z_{\mathrm{SDSS}}$ (third row), and $z_{\mathrm{LISA}} = 5z_{\mathrm{SDSS}}$ (bottom row). Vertical axes are in arbitrary
units.

FIG. 4: Distributions of $N_j$ values for realizations in Figure 3 (in the same order).

TABLE II: Results for realizations in Fig. 3, starting from top left and going through each row left to right.



A   Number of Boxes   Average $z_{\mathrm{SDSS}}$   Average $z_{\mathrm{LISA}}$   $\sum N_j$   Width of Gaussian          Mean $H_0$
    (Sources)                                                                                  [km s$^{-1}$ Mpc$^{-1}$]   [km s$^{-1}$ Mpc$^{-1}$]
2   20                0.115                         0.231                         712          0.97                       69.74
2   18                0.103                         0.206                         532          0.89                       70.03
2   18                0.105                         0.211                         525          0.54                       69.81
3   20                0.111                         0.333                         2783         1.22                       69.55
3   20                0.115                         0.344                         2454         1.11                       69.97
3   20                0.088                         0.263                         2320         1.51                       69.97
4   20                0.086                         0.344                         4794         0.93                       69.75
4   20                0.108                         0.434                         7361         0.82                       70.16
4   20                0.099                         0.396                         6668         1.21                       70.69
5   20                0.112                         0.558                         14648        0.55                       70.46
5   20                0.115                         0.577                         16929        1.17                       68.90
5   20                0.088                         0.442                         11048        1.42                       70.00



TABLE III: Errors in the mean $H_0$ averaged over 25 realizations, for one synthetic interferometer.

A   $\delta H_0$ [km s$^{-1}$ Mpc$^{-1}$]
1   0.22
2   0.16
3   0.24
4   0.51
5   0.68
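
The entries of Table III can be converted to fractional errors with a few lines of arithmetic. This is only a convenience sketch: the values are copied from the table, and dividing by the assumed true value of 70 km s$^{-1}$ Mpc$^{-1}$ is the one step added here.

```python
H0_TRUE = 70.0  # km/s/Mpc, the "true" value assumed in the text

# Mean absolute offset |70 - mean| per scaling factor A, from Table III.
delta_h0 = {1: 0.22, 2: 0.16, 3: 0.24, 4: 0.51, 5: 0.68}

# Fractional error in percent for each scaling factor.
percent_error = {a: 100.0 * d / H0_TRUE for a, d in delta_h0.items()}

for a, p in percent_error.items():
    print(f"A = {a}: {p:.2f}%")
```

Every scaling stays below one percent, with the largest value, at $A = 5$, just under that level.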